Measuring classification performance: the hmeasure package
نویسنده
چکیده
The ubiquity of binary classification problems has given rise to a prolific literature dedicated to the proposal of novel classification methods, as well as the incremental improvement of existing ones, that largely relies on effective performance metrics. The inherent trade-off between false positives and false negatives, however, complicates any attempt at providing a scalar summary of classification performance. Several measures have been proposed in the literature to overcome this difficulty, prominent among which is the Area Under the ROC Curve (AUC), implementations of which are widely available. The AUC has recently come under criticism for handling the aforementioned tradeoff in a fundamentally incoherent manner, in the sense that it treats the relative severities of misclassifications differently when different classifiers are used. A coherent alternative was recently proposed, known as the H measure, that can optionally accommodate expert knowledge regarding misclassification costs, whenever that is available. The hmeasure package computes and reports the H measure alongside most commonly used alternatives, including the AUC. It also provides convenient plotting routines that yield insights into the differences and similarities between the various metrics.
منابع مشابه
Investigate Factors Affecting on the Performance of Agricultural Machinery Companies Based on Taxonomy Algorithm
Taxonomy(general), the practice and science of classification of things or concepts, including the principles that underlie such classification. Economic taxonomy, a system of classification for economic activity. The main objective of the study was to find whether financial ratios affect the performance of the Agricultural Machinery companies in Iran. A firm performance evaluation and its comp...
متن کاملstepwiseCM: An R Package for Stepwise Classification of Cancer Samples Using Multiple Heterogeneous Data Sets
This paper presents the R/Bioconductor package stepwiseCM, which classifies cancer samples using two heterogeneous data sets in an efficient way. The algorithm is able to capture the distinct classification power of two given data types without actually combining them. This package suits for classification problems where two different types of data sets on the same samples are available. One of...
متن کاملDesign and Construction of an Aerosol Particle Classification System Based on Electrical Mobility
Introduction: The application of particles’ electrical mobility in the electric field has always been an important concern, as the functional basis of a number of particle measuring and classification instrumentations. The objective of this study was to design and construct an aerosol particles classification system using electrical mobility feature in laboratory scale. Methodology: This labo...
متن کاملBody Mass Index Classification based on Facial Features using Machine Learning Algorithms for utilizing in Telemedicine
Background and Objectives: Due to the impact of controlling BMI on life, BMI classification based on facial features can be used for developing Telemedicine systems and eliminating the limitations of measuring tools, especially for paralyzed people. So that physicians can help people online during the Covid-19 pandemic. Method: In this study, new features and some previous work features were e...
متن کاملMeasuring the Similarity of Trajectories Using Fuzzy Theory
In recent years, with the advancement of positioning systems, access to a large amount of movement data is provided. Among the methods of discovering knowledge from this type of data is to measure the similarity of trajectories resulting from the movement of objects. Similarity measurement has also been used in other data mining methods such as classification and clustering and is currently, an...
متن کامل